Tests overhaul by crusaderky · Pull Request #143 · explosion/cython-blis

crusaderky · 2025-12-08T17:59:24Z

This PR thoroughly revisits the test suite. It's made of several commits, each with individual comments:

Tweak test tolerances
- Tighten tolerance for float64 from 1e-3 ~ 1e-4 to 1e-9.
- Relax tolerance for float32 to 1e-2, as increasing the number of valid examples led to the discovery of #142 (dotv: float32 rounding error is 10x of NumPy).
- Swap actual <-> desired in assert_almost_equal, as the two are not symmetric.
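
Why argument order can matter is sketched below with NumPy's `assert_allclose`. This is an assumed stand-in, since the PR does not show the body of the comparison helper it calls. `assert_allclose` scales the relative tolerance by the desired value only, so swapping the two arguments can turn a pass into a failure. The helper `passes` and all numbers are hypothetical.

```python
from numpy.testing import assert_allclose


def passes(actual, desired, rtol):
    """Return True if assert_allclose accepts the pair, False otherwise."""
    try:
        assert_allclose(actual, desired, rtol=rtol)
    except AssertionError:
        return False
    return True


# The check is |actual - desired| <= atol + rtol * |desired| (atol defaults to 0),
# so the allowed error is 0.095 * 110 = 10.45 in one order
# and 0.095 * 100 = 9.5 in the other, while the difference is 10 in both.
forward = passes(actual=100.0, desired=110.0, rtol=0.095)
swapped = passes(actual=110.0, desired=100.0, rtol=0.095)
```

With these values `forward` is accepted and `swapped` is rejected, which is why the reference result belongs in the `desired` position.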
Tweak strategies ranges
- Add test coverage for arrays of size 1.
- Remove unused, confusing default parameters from custom strategies.
Add central control for number of examples tested
This is meant to be adjusted locally for more thorough (slower) test runs.
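
One way such a central control could look is a single Hypothesis `settings` object shared by every test. This is a minimal sketch, not the code from the PR: the names `MAX_EXAMPLES`, `suite_settings`, and `check_positive` are hypothetical.

```python
from hypothesis import given, settings, strategies as st

# Hypothetical module-level knob; raise it locally for a more thorough (slower) run.
MAX_EXAMPLES = 50

# A settings object can be applied to tests as a decorator.
# database=None avoids writing an example database to disk in this sketch.
suite_settings = settings(max_examples=MAX_EXAMPLES, deadline=None, database=None)


@suite_settings
@given(st.integers(min_value=1, max_value=10))
def check_positive(n):
    assert n >= 1
```

Every test decorated with `suite_settings` picks up the new example count when the constant is changed in one place.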
New test for dotv invalid use case
Mirrors same test in test_gemm.py.
Overhaul hypothesis strategies
- Improve readability.
- Given infinite hypothesis examples, this commit does not introduce any functional changes.
  However, given a fixed and relatively low max_examples setting, it substantially increases test coverage:
  1. It avoids generating examples that were previously discarded by assume(). According to pytest --hypothesis-show-statistics, these were ~10% for dotv and ~20% for gemm.
  2. It avoids examples that ended up being duplicates due to trimming. For example, in dotv, hypothesis could previously generate both A=[1,2,3,4], B=[5,6] and A=[1,2], B=[5,6,7,8]; both would get trimmed to A=[1,2], B=[5,6].
  3. In gemm, it removes an entire degree of freedom by removing unused variable out_col, which again would result in duplicate examples.
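
The idea behind points 1 and 2 can be sketched as a strategy that draws the shared length first and then builds both vectors to exactly that length, so nothing needs to be trimmed or rejected with assume(). This is an illustration under assumptions, not the strategy from the PR: `same_length_vectors`, `check_lengths_match`, and the value ranges are hypothetical.

```python
from hypothesis import given, settings, strategies as st

# Bounded floats; NaN is excluded so that results stay comparable.
finite = st.floats(min_value=-100, max_value=100, allow_nan=False)


@st.composite
def same_length_vectors(draw, max_size=16):
    """Draw two lists of floats that share one randomly chosen length."""
    n = draw(st.integers(min_value=1, max_value=max_size))
    vec = st.lists(finite, min_size=n, max_size=n)
    return draw(vec), draw(vec)


@settings(max_examples=25, deadline=None, database=None)
@given(same_length_vectors())
def check_lengths_match(pair):
    a, b = pair
    # Every generated example is valid: no trimming and no assume() needed.
    assert len(a) == len(b) >= 1
```

Because the length is a single drawn value, each of the `max_examples` examples is a distinct, usable input.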
Add threading tests for shared input
Given input arrays that are shared between multiple threads, test that you can run dotv and gemm on them in parallel from multiple threads.
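
A shared-input threading test could be shaped roughly as follows. In this sketch `np.dot` is a stand-in for dotv so that the example runs without blis installed; the helper `run_shared` and its parameters are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def run_shared(func, a, b, n_threads=4, n_calls=32):
    """Call func(a, b) n_calls times across n_threads threads, sharing a and b."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        futures = [pool.submit(func, a, b) for _ in range(n_calls)]
        return [f.result() for f in futures]


# Small integer-valued inputs keep the float64 result exact.
a = np.arange(8, dtype=np.float64)
b = np.ones(8, dtype=np.float64)
expected = np.dot(a, b)
results = run_shared(np.dot, a, b)
```

Each thread reads the same two arrays, and every call is expected to return the same value as the single-threaded reference.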

crusaderky · 2025-12-08T18:07:06Z

 # Copyright ExplosionAI GmbH, released under BSD.
-import numpy as np
-
-np.random.seed(0)


This did nothing: https://hypothesis.readthedocs.io/en/latest/reference/strategies.html#hypothesis.strategies.random_module

Hypothesis always seeds global PRNGs before running a test, and restores the previous state afterwards.

crusaderky added 6 commits December 8, 2025 17:48

Tweak strategies ranges

820905c

Add test coverage for arrays of size 1. Remove unused, confusing default parameters from custom strategies.

Add central control for number of examples tested

3d0f9ad

This is meant to be tampered with locally for more thorough (slower) tests.

New test for dotv invalid use case

0fe793e

Mirrors same test in test_gemm.py.

Add threading tests for shared input

555d35f

crusaderky marked this pull request as ready for review December 8, 2025 18:06

crusaderky merged commit 14781ae into explosion:main Dec 8, 2025
64 checks passed

crusaderky deleted the tests-overhaul branch December 8, 2025 21:17

rgommers mentioned this pull request Feb 19, 2026

dotv: float32 rounding error is 10x of NumPy #142

Closed
